R Programming

Yiwen Zhang
Aug 26 2014

Why R?

“R is really important to the point that it is hard to overvalue it. It allows statisticians to do very intricate and complicated analyses without knowing the blood and guts of computing systems.”

— Daryl Pregibon, a research scientist at Google

Pros and Cons

Pros:

  • Open source
  • Abundant statistical packages
  • Simplicity
  • Data visualization
  • Nice IDE

Cons: it can be slow

  • Interpreted language
  • Memory management
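
One common way to reduce interpreter overhead is to replace element-by-element loops with vectorized calls. The following is a minimal sketch in base R; sum_squares_loop and sum_squares_vec are hypothetical helper names, and the timings will vary by machine and R version.

```r
# Hypothetical helpers for illustration: both compute the sum of squares.
sum_squares_loop <- function(x) {
  total <- 0
  for (value in x) {
    total <- total + value^2
  }
  total
}

sum_squares_vec <- function(x) {
  sum(x^2)  # the iteration happens in compiled code inside ^ and sum()
}

x <- runif(1e6)

# Both versions agree up to floating-point rounding.
stopifnot(isTRUE(all.equal(sum_squares_loop(x), sum_squares_vec(x))))

# Compare elapsed time; exact numbers depend on the machine and R version.
print(system.time(sum_squares_loop(x)))
print(system.time(sum_squares_vec(x)))
```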

Memory Management

  • Object size: object_size() (from the pryr package)

    • Every length-0 vector occupies 40 bytes of memory (the exact figure depends on the R version and platform).
  • Memory usage and garbage collection: mem_used(), mem_change()

  • Modification in place

    • Primitive vs. non-primitive functions
    • for loops can be slow
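
The points above can be explored with a short sketch. To avoid an extra dependency, it uses base R's object.size() and gc() rather than the pryr functions named on the slide, which is a substitution made for this example. Byte counts depend on the R version and platform, so none are listed. The tracemem() part only runs on R builds with memory profiling enabled.

```r
# An empty vector is not free: every vector carries a header.
empty_size <- object.size(numeric(0))
print(empty_size)

# Size grows with length; compare a few lengths.
sizes <- sapply(
  c(0, 1, 10, 100, 1000),
  function(n) as.numeric(object.size(numeric(n)))
)
print(sizes)

# gc() triggers a garbage collection and reports the memory in use.
print(gc())

# Copy on modify: while two names point at the same vector,
# changing one of them forces R to copy it first.
x <- c(1, 2, 3)
y <- x          # no copy yet, x and y share the same memory
y[1] <- 10      # y is copied here; x is unchanged
stopifnot(identical(x, c(1, 2, 3)), identical(y, c(10, 2, 3)))

# tracemem() reports copies, but only on builds with memory profiling.
if (capabilities("profmem")[[1]]) {
  z <- c(1, 2, 3)
  tracemem(z)
  w <- z
  w[1] <- 10    # a "tracemem[...]" message marks the copy
  untracemem(z)
}
```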

A Good IDE Matters!

RStudio

Download the regular release or the preview version

  • Control R version
  • Vim editing mode
  • Change theme
  • Version control with Git or SVN
  • Create a project and an R script
  • Save workspace and changed script
  • Documents
    • PDF: LaTeX, Sweave, knitr
    • HTML: R Markdown, Rpres, Notebook
  • Web application: Shiny

RStudio

Useful shortcuts for writing R scripts:

  • Ctrl+Space or Tab: show completions and information about a function
  • Tab: show the details of a function's arguments
  • Ctrl+Shift+C: comment or uncomment lines with “#”
  • While code executes, the Environment tab shows the objects in your current environment
  • Ctrl+R or Ctrl+Enter: execute the code
  • Source and Source on Save: useful when editing functions
  • Debugging (more details on Thursday)

Good Coding Habits

  • Indentation
  • Assignment (use <-, not =)
  • Line length (80 characters preferred)
  • Comment your code
  • Naming Convention
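
As an illustration of these habits, here is a short sketch. The function name columnMeans and the scores data are made up for the example.

```r
# Compute the mean of each numeric column in a data frame.
# `columnMeans` is a hypothetical helper, named in lower camel case.
columnMeans <- function(dataFrame) {
  # Keep only numeric columns so that mean() is well defined.
  isNumeric <- vapply(dataFrame, is.numeric, logical(1))
  numericData <- dataFrame[, isNumeric, drop = FALSE]

  # Two-space indentation, `<-` for assignment, lines under 80 characters.
  vapply(numericData, mean, numeric(1))
}

# Made-up data for the example.
scores <- data.frame(
  name = c("a", "b", "c"),
  math = c(90, 80, 70),
  art = c(60, 75, 90),
  stringsAsFactors = FALSE
)

result <- columnMeans(scores)
print(result)
```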

Naming Convention

Be consistent!

  • all lower case: searchpaths

  • period separated: as.numeric, read.table

  • underscore separated: package_version

  • lower camel case (suggested): colSums, sessionInfo

  • upper camel case: Vectorize, NextMethod
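
Base R itself mixes all of these styles, which is why picking one and sticking to it matters in your own code. A small sketch that checks each example name above resolves to a function in a default R session (the list name examples and its element names are only for this illustration):

```r
# The example names from the list above, grouped by style.
examples <- list(
  allLower = "searchpaths",
  periodSeparated = c("as.numeric", "read.table"),
  underscoreSeparated = "package_version",
  lowerCamel = c("colSums", "sessionInfo"),
  upperCamel = c("Vectorize", "NextMethod")
)

# Each name should resolve to a function when the default packages
# (base, utils, and so on) are attached.
isFunction <- vapply(
  unlist(examples),
  function(name) exists(name, mode = "function"),
  logical(1)
)
print(isFunction)
```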